PART I: EXECUTIVE SUMMARY AND QUESTIONS FOR MAC


Since early 2000, the ABS has expanded significantly the resources it devotes to 
analytical work. The main aims of this increased investment are to : 


• fill gaps in our suite of statistical products, say by constructing measures of
socioeconomic concepts that cannot be delivered directly from our censuses
and surveys, or generating estimates for smaller domains (smaller areas or
subpopulations, etc.) than can be supported by our direct collections;


• enhance the quality of our statistical products.


More recently, the economic and social subject matter divisions of the ABS were
re-organised, partly with a view to freeing resources for additional analytical work. 
So it is timely for the ABS to consider what kinds of analyses it should undertake 
and how their relevance and quality can be assured. Two issues are prominent in 
our thinking : 


• The "bounds" of the ABS analysis program. What varieties of analyses are
legitimate for a national statistical agency to undertake? 


• The "focus" of the ABS analysis program. What varieties of analyses would it
be most fruitful for a national statistical agency to undertake? 


> In this paper, we seek MAC members’ views on methodological 
considerations that affect the bounds and focus of the ABS analysis 
program. 


We would welcome members' comments on any aspect of this issue, and in 
particular on the following questions : 


1. Are there any kinds of analytical work that would be unsuitable for the ABS to 
undertake? If so, how would such work be characterised? 


o by technique? (difficult to master; novel, embryonic or still
controversial; not yet implemented in standard computing packages;
etc.)

o by subject matter or underlying socioeconomic theory? 


2. What kinds of analytical work would it be most fruitful for the ABS to
undertake?


3. How can we ensure that the analyses we undertake are professionally
defensible, including ways of :


o defining the quality of our analytical products and advice?
o obtaining peer review and other quality assurance?
o making the quality characteristics of our analytical products visible?
PART II: BODY TEXT


A. ABS Thinking about the Bounds and Focus of Its Analysis Program


1 In a 1993 speech, Ivan Fellegi (head of Statistics Canada) listed eight 
reasons for a statistical agency's undertaking analysis : 


Some forms of analysis needed by the user community can be undertaken 
only within the statistical agency, because they require access to the full 
microdata. 


Some important, mainstream statistical products are inherently analytical 
constructs and cannot be delivered by classical estimation from censuses 
or surveys. 


Analysis can bring to light quality concerns with statistical products — and 
can suggest ways of addressing those concerns. 


Analysis can bring to light gaps in the information available to 
decision-makers — and can construct products that fill those gaps. 


Analysis can extend estimates to domains (say, geographic areas or 
subpopulations or subindustries or time periods) which classical 
survey-based methods cannot (or at least do not) support. 


Analysis can help build bridges between the statistical agency and its user 
community (especially sophisticated or high-end users in policy 
departments and the universities). 


Analysis can engender a deeper understanding of socioeconomic subject 
matter within the agency, and can enhance the ways in which traditional 
survey processes, development of classifications and frameworks and 
other tasks are undertaken. 


Analysis can bring to light key features of the agency's datasets and can 
help the agency highlight and promulgate the results of its major data 
collections. 


2 During the past three years, the ABS has expanded significantly the 
resources it devotes to analytical work. We condense Fellegi's eight points into two 
— the ABS' expanded analytical program fills gaps in our statistical product mix and 
enhances and extends our existing products. 


3 During their first three years, the ABS analytical groups have been working 
through a menu of analytical problems devised by senior managers in late 1999. We 
believe that we have delivered some valuable new analytical products, and have 
helped improve other parts of ABS outputs and processes. We have largely 
exhausted the 1999 menu. 


4 It is a good time to take stock of whether the analytical program has been 
doing the right things, and whether we have been doing things right. Such a 
stocktake has many dimensions, and we are adopting a variety of approaches to our 
evaluation. For example, a project team of participants in our premier leadership 
program is addressing such issues as : 


• The risks that an analytical program may pose to the reputation of the
ABS. The status and marque that should be accorded to research papers. 
Clearance procedures. 


• How the quality of analytical work should be managed. The role of
external advisors, reference groups and peer reviewers. 


See Attachment 1 for the project team's terms of reference. 


5 Two issues have arisen prominently in our stocktake, and we want to obtain 
MAC members' views on them : 


• The "bounds" of the ABS analysis program. What varieties of analyses are
legitimate for a national statistical agency to undertake? 


• The "focus" of the ABS analysis program. What varieties of analyses
would it be most fruitful for a national statistical agency to undertake? 


6 Section (B) of the paper describes some major styles of analytical projects. A 
list of our past and present projects is in Attachment 2, and some of these are 
discussed in greater detail in Attachment 3. (We have provided this material so that 
MAC members can consider the issues in the light of concrete examples of work we 
have undertaken.) Attachment 4 outlines the ways in which the ABS decides what 
analytical projects to pursue. 


7 Section (C) lists some analytical methodologies that we think will be 
prominent in our work during the next couple of years. 


8 Section (D) suggests some methodological considerations relevant to defining 
the bounds and choosing the focus of ABS analytical work. Most prominent of these 
considerations is whether we have (or can acquire or co-opt) the technical capacity 
to deliver good quality, professionally defensible work of such-and-such a style. 
Attachment 5 provides examples of the peer review and advisory panels we have 
used to check the direction of our analytical projects and the quality of our outputs. 


B. Some Styles of ABS Analytical Work 


9 The discussion is grouped under four analysis themes. Attachment 2 lists 
some other styles of work and provides case studies. 


(a) Exploiting By-product Datasets 


10 Traditionally, national statistical agencies have relied largely on their own 
censuses and sample surveys when compiling economic and social statistics. Such 
direct collections will remain a major element of ABS operations. But government 
departments and businesses are accumulating large databanks that potentially have 
considerable value for statistical purposes. Their main purpose is to assist 
management of the department's own business operations, although some 
departments are now extracting performance information and other statistics. The 
ABS is exploring possibilities for using administrative by-product data to enhance the 
national statistical service. 


(b) Exploiting Melded and Multiple Datasets 


11 To answer some analytical questions, we must use multiple datasets. This is 
especially true when the question spans multiple domains (having, say, both social 
and economic aspects) or when we wish to develop estimates for smaller domains 
than a single collection will support. Statistics compiled at the whole-of-Australia 
level satisfy the needs of many decision-makers and researchers. But other users 
need data dissected by geographic areas (say, States or regions), by 
subpopulations (say, age-sex groups or household types) or by industries or other
dimensions. It is not possible to run a census for all socioeconomic topics, owing to 
the prohibitive financial cost and the load on households and businesses that 
provide the data. And while ABS sample surveys can deliver somewhat 
disaggregated estimates, there is an ever-growing user demand for more and finer 
dissections. 


(c) Model-based Data Construction 


12 Policy-makers and researchers appeal to a wide variety of socioeconomic 
concepts, not all of which can be measured directly in statistical surveys; in many 
cases, the survey data must be transformed or modelled to meet users' needs. The 
ABS already publishes statistical measures for many concepts that arise in 
economic or social theory and underlie policy design — these include aggregate 
economic activity, productivity, inflation, income distribution, life expectancy, the 
energy intensity of production and so on. 


(d) Analytical Products that Cut Across the Economic/Social/Environmental 
Domains 


13 The ABS publishes a rich suite of statistical products describing major 
aspects of Australian life — the economy, society and environment. There is a 
growing demand for statistical products that draw information together, regardless of 
source. Such ‘integrating’ statistical products help decision-makers and the 
community form a more comprehensive view of some aspect of life; they also help 
researchers analyse the interactions between key variables. 


Topics and Styles of Analysis — Have We Overstepped the Bounds? 


14 In our opinion, none of these styles of analysis is inappropriate per se for the
ABS. But particular projects of each style might sail close to the wind or pose a 
hazard to the ABS reputation for objectivity : 


• When we are analysing administrative by-product datasets, our clients are
often most interested in analyses that will guide the design or evaluation of 
policy. We want our analyses to be relevant to these important applications. 
We must ensure that the ABS informs policy — but does not become 
entangled in it. For example, our analyses of FaCS-Centrelink data carried 
implications regarding the efficacy of the "activity testing" of welfare 
beneficiaries, but our research report had to adhere to a strictly objective 
analysis of patterns in the data. 


• When we are analysing melded and multiple datasets, we must ensure that
our linking across databanks does not violate (or create the impression that 
we are violating) confidentiality undertakings. For example, our analyses of 
linked hospital-Medicare-Pharmaceutical Benefits data could not commence 
till they were checked for consistency with the Privacy Commissioner's 
guidelines. 


• When we are constructing data based on a socioeconomic theory or model,
we must ensure that there is broad consensus support for the model we 
have adopted — or at the least ensure that the dependence of our estimates 
on certain assumptions is completely transparent. For example, our work on 
human capital has adopted a very particular approach (namely, basing the 
value of human capital on its economic benefits) which not all commentators 
would accept. 


• When we are doing cross-cutting analyses to inform community discussion
of important socioeconomic questions, we must ensure that the ABS does 
not appear to have taken sides on contestable issues. For example, a few 
commentators have said that the term "progress" is irreducibly value-laden 
and that —even by providing a suite of progress indicators from which 
readers can choose— the ABS has taken sides. 


> We would welcome MAC members' advice on whether there are any broad 
topics or styles of analysis that we should steer clear of. 


C. Some Emerging Technical Themes for ABS Analytical Work 


15 Much of the ABS analysis program relies on traditional techniques from 
mathematical statistics, econometrics, time series analysis and other disciplines. But 
some of the technical matters arising in the program are novel (not having been 
addressed until fairly recently in the literature) or at least new to the ABS (not yet 
having been applied to the development of the bureau's statistical products). 


Multilevel analyses 


16 Some of our projects are trying to construct socioeconomic measures or 
analyse data patterns at multiple geographic levels (say, both States and Statistical 
Local Areas) or for multiple units (say, both persons and households). The
relationships between variables can be quite complex. For example, the probability 
of falling victim to a crime may be influenced both by the characteristics of individual 
people and by the characteristics of the areas in which they live. Moreover, the 
strength of the various influences may rise or fall as one changes the unit of analysis 
from individuals to households or as one moves from coarse to fine geography. 


Longitudinal analyses (of longitudinal and quasi-longitudinal datasets) 


17 Some of the socioeconomic questions we are asked to address (say, about 
labour market experience or the rise and fall of businesses) are analysed most 
naturally from a longitudinal perspective. During the past decade, the ABS has run 
two major longitudinal surveys — the Survey of Employment and Unemployment 
Patterns and the Growth and Performance Survey (also called the Business 
Longitudinal Survey). We continue to base some analyses on these — most 
recently in a joint ABS-Productivity Commission project about the influence of 
information technology on business performance. We also have (or may obtain) 
access to other databanks such as the Longitudinal Dataset of FaCS-Centrelink 
customers and the Household, Income and Labour Dynamics in Australia Survey. 
Equally, we are interested in whether it is possible to exploit data other than those 
gathered from a truly longitudinal survey to answer questions of a broadly 
longitudinal character. 


Estimating for small domains 


18 There is a large and growing demand for estimates that relate to smaller 
domains (chiefly small geographic areas, but also subpopulations and subindustries) 
than can be supported by ABS surveys. During the past few years, the ABS has 
done a good deal of work of this kind — MAC members may recall the 
Bell/Pfeffermann work on labour force trends and the Tanton/Jones work on crime 
rates for small areas. 


19 We are not satisfied that we have an adequate understanding of the 
estimation methods that should be preferred in given circumstances. State-of-the-art 
methods can be complex and too expensive to apply to more than a small number 
of the demands that we must satisfy. So we must achieve a better understanding of
how we might generate defensible or usable estimates in production mode;
that is, whether we can achieve approximately-right answers using simpler methods.
We intend to make a serious assault on these questions during the coming year, 
and shall submit work-in-progress papers for MAC members' comments. 
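
As an illustration of the kind of "simpler method" at issue, the sketch below shows a composite (shrinkage) estimator, a standard device in the small area literature. It is not a description of ABS practice. The function name, arguments and figures are hypothetical, and the variances are assumed known, which in real applications they are not.

```python
def composite_estimate(direct, direct_var, synthetic, model_var):
    """Blend a direct survey estimate with a model-based (synthetic) one.

    The weight on the direct estimate is model_var / (model_var + direct_var):
    the noisier the direct estimate for a small domain, the more the result
    leans on the synthetic estimate. All inputs are assumed known here.
    """
    if direct_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = model_var + direct_var
    if total == 0:
        # Both sources are treated as exact; fall back on the direct estimate.
        return direct, 1.0
    weight = model_var / total
    return weight * direct + (1.0 - weight) * synthetic, weight


# Hypothetical small area: direct estimate 10.0 with sampling variance 4.0,
# synthetic estimate 6.0, between-area model variance 4.0.
estimate, weight = composite_estimate(10.0, 4.0, 6.0, 4.0)
```

With equal variances the two sources receive equal weight. A domain with a large sample (small direct variance) would keep an estimate close to its direct survey figure.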


Analysing by-product datasets 


20 As mentioned earlier, exploiting administrative and business by-product 
datasets for statistical purposes is a major theme in our work program. Over the 
years, the ABS has developed a large array of tools (mathematics, procedures and 
software) to analyse datasets collected through the bureau's own censuses and 
sample surveys. The question arises, however, whether those tools remain 
appropriate when we must deal with very large by-product datasets : 


• How might traditional models and methods have to change to deal with
datasets that have not been assembled using ABS classifications, 
definitions and collection methods? 


• What methods are needed to assess the quality of the datasets (and
especially to detect any drift in quality as time passes)? 


• How can our analyses deal with the fact that the data may be partial
(because the databank covers only the customers of a department, not
the whole population)?


• How might traditional research strategies have to change? For example,
might the bulk of the exploratory analyses be done on sampled datasets, 
and the preferred or final model be validated against the full dataset? 
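
The last strategy in the list above (explore on a sample, then validate against the full dataset) can be sketched as follows. This is a minimal illustration on simulated data, not an ABS procedure. The dataset, the sample size of 1,000 and the straight-line model are all assumptions made for the example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root mean squared prediction error of the fitted line."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Simulated stand-in for a large by-product dataset (hypothetical values).
rng = random.Random(42)
full_x = [rng.uniform(0.0, 10.0) for _ in range(100_000)]
full_y = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in full_x]

# Exploratory stage: fit the candidate model on a simple random sample.
sample_idx = rng.sample(range(len(full_x)), 1_000)
sample_x = [full_x[i] for i in sample_idx]
sample_y = [full_y[i] for i in sample_idx]
slope, intercept = fit_line(sample_x, sample_y)

# Validation stage: check that the sample-based model holds up on all records.
sample_rmse = rmse(sample_x, sample_y, slope, intercept)
full_rmse = rmse(full_x, full_y, slope, intercept)
```

A full-dataset error close to the sample error suggests the model chosen during exploration generalises. A marked gap would signal that the sample missed structure present in the full databank.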


Analysing huge datasets 


21 The datasets being used in some analytical projects (especially the
transactional and customer databanks) can be very large. Exploiting the statistical
potential of such datasets may prompt some reconsideration of ABS research 
strategies, analytical techniques and software tools. 


Analyses that take account of complex survey design 


22 The sampling designs for some ABS surveys can be quite complex. When it 
later comes to analysis, however, many standard techniques for fitting and testing 
models ignore the complex sample design — in effect, it is assumed that the data 
have been drawn by simple random sampling. This expedient can lead to invalid 
inferences about the explanatory power of one's models; it may even lead us to 
choose the wrong model. 
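
The effect can be seen in a small simulation. The sketch below is illustrative only: it assumes a one-stage cluster sample with equal-sized clusters and a negligible sampling fraction, and the data are simulated. It compares the standard error of a mean computed as if the data were a simple random sample with one that treats clusters as the sampling units.

```python
import math
import random
import statistics


def srs_standard_error(values):
    """Standard error of the mean, treating the data as a simple random sample."""
    return math.sqrt(statistics.variance(values) / len(values))


def cluster_standard_error(clusters):
    """Standard error of the mean, treating clusters as the sampling units.

    With equal-sized clusters the overall mean equals the mean of the cluster
    means, so its variance is estimated from the spread of those means.
    """
    cluster_means = [statistics.mean(c) for c in clusters]
    return math.sqrt(statistics.variance(cluster_means) / len(cluster_means))


def simulate_clustered_sample(n_clusters=50, cluster_size=20, seed=1):
    """Hypothetical data: units in a cluster share a common area effect."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        area_effect = rng.gauss(0.0, 1.0)
        clusters.append(
            [area_effect + rng.gauss(0.0, 1.0) for _ in range(cluster_size)]
        )
    return clusters


clusters = simulate_clustered_sample()
pooled = [y for cluster in clusters for y in cluster]
naive_se = srs_standard_error(pooled)
design_se = cluster_standard_error(clusters)
# Ratio of the design-based variance to the naive variance (the design effect).
design_effect = (design_se / naive_se) ** 2
```

When units within a cluster resemble one another, the design effect exceeds one, so tests and confidence intervals built on the naive standard error overstate the precision of the estimate.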


Advice from MAC and other experts regarding analytical technique 


23 During the past year or two, we have been scouring the literature, evaluating 
software and running pilot projects to acquaint us with these issues. We have been 
building our knowledge, but do not yet have a solid or confident grounding. In some 
cases, the issues are still being developed or debated in the literature, and software 
packages do not yet embody all the preferred techniques. We shall bring papers on 
our research strategies and pilot projects to MAC during the coming year. 


Analytical Techniques — Have We Overstepped the Bounds? 


24 In our opinion, none of the techniques listed above is inappropriate per se for
the ABS. Some other techniques that we have been asked to adopt may be sailing 
close to the wind — for example, applying data envelopment analysis or stochastic 
frontier modelling to assess the performance of government service providers. Our 
rule is that we will provide technical advice and will undertake objective components 
of the analyses, but drawing the policy implications is the responsibility of our 
clients. 


> We would welcome MAC members' advice on whether any analytical techniques
are so complex (or consensus on preferred technique is still so far away) that we
should steer clear of them.


In our opinion, small area estimation and analysing complex surveys may fall into 
this category at present. But we face strong and growing demand for analyses of 
these kinds, so must find a way of delivering professionally defensible work. 


D. (How) Can We Assure the Methodological Quality of Our Analytical 
Work? 


25 For us, a key consideration is whether we have (or can acquire or co-opt) the 
technical capacity to deliver good quality, professionally defensible work of 
such-and-such a style or using such-and-such a technique. 


26 When we prototype any new analytical product, we have three 
responsibilities: 


• defining and assessing the quality attributes of our product

• managing our production process to assure quality

• making the quality attributes visible, so our product can be used
intelligently.


Defining quality for analytical products 


27 The ABS has now agreed on a standard array of quality attributes — 
relevance, coherence, accessibility, interpretability, timeliness and accuracy. (See 
Geoff Lee's paper "Making Data Quality Visible", May 2002.) During recent months, 
we have been reworking our guidance documents on the quality of analytical 
products to align with the ABS standard. This alignment is an essential but not an 
elementary task : 


• The quality of analytical products is affected both by classical statistical
processes (such as sampling) and by novel or complex transformations
(such as multilevel modelling). So we must be concerned with the ways in 
which errors in our raw input data translate into errors in our analytical 
outputs. We must also be concerned with model error (choosing the 
wrong data transformation). 


• Many of our analytical projects draw on multiple or melded datasets.
Others draw on administrative or transactional databanks. The ABS is 
gradually building up its understanding of the statistical and business 
processes that generate those databanks. 
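
The first point above, on how errors in raw inputs translate into errors in analytical outputs, can be illustrated by simulation. The sketch below is one simple approach and not an ABS standard. It assumes independent, normally distributed input errors with known standard deviations, and the ratio transformation and all figures are hypothetical.

```python
import random
import statistics


def propagate_input_error(inputs, input_sds, transform, n_draws=5000, seed=0):
    """Monte Carlo propagation of input error through a transformation.

    Each draw perturbs every input by independent normal noise with the given
    standard deviation, then applies the transformation. The spread of the
    results indicates the error induced in the analytical output.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        perturbed = [rng.gauss(value, sd) for value, sd in zip(inputs, input_sds)]
        draws.append(transform(perturbed))
    return statistics.mean(draws), statistics.stdev(draws)


def ratio(values):
    """Hypothetical analytical output: a ratio of two survey estimates."""
    numerator, denominator = values
    return numerator / denominator


# Inputs of 50 and 200 with standard errors of 2 and 4 (illustrative figures).
mean_out, sd_out = propagate_input_error([50.0, 200.0], [2.0, 4.0], ratio)
```

The same harness can be rerun with an alternative transformation to gauge how far the output depends on the model chosen, which bears on the model error mentioned in the same point.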


Assuring quality 


28 Our key strategy for assuring the quality of our analytical products has been 
peer review. Quality flaws may arise in analytical products for many reasons — we 
may use an inappropriate raw dataset or misunderstand its content or limitations; we 
may be ignorant of key subject matter concepts; we may choose an inappropriate 
analytical technique or apply it wrongly; we may misinterpret our results; and so on. 
It is seldom the case that a single quality advisor can steer us clear of all these 
hazards. So we insist that —for almost every analysis project that is prototyping a 
new product, and occasionally for our other projects as well— we recruit a panel of 
advisers and reviewers inside and outside the ABS. 


29 Members of each "peer review panel" are asked to critique our plan of attack, 
our work-in-progress and/or our draft project report. For larger projects, we may 
convene a workshop or a walk-through session. It is impossible to exaggerate the 
value of peer review to us. The rigour of our analyses and the clarity of our 
interpretations have benefited greatly from comments by our internal and external 
reviewers. For lists of recent and current peer review panels, see Attachment 5. 


30 Setting up a peer review process so we get best value from it (and distilling 
the key messages from a round of peer review) is an art that we are still learning. 
Our experience is that we derive best value from the process when the reviewer 
reads our project reports and the project team undertakes a structured walk through 
its data, methods and findings with the reviewer. 


31 We have recently reconsidered somewhat our approach to recruiting external 
reviewers, particularly academics. Some university staff have been very generous 
with their time, especially if one of our projects has piqued their imagination or 
promises to deliver statistics that will advance their own research. We shall continue 
to rely on such good will and energy. But for some key analytical prototypes or for 
projects that involve complex modelling, we are willing to pay an academic 
researcher to spend, say, several days doing a thorough critique of our methods and 
findings. 


32 A kindred scheme that we are working toward is engaging some academic 
experts to sit in our branch as "non-ongoing ABS employees" — each expert 
interacts intensively with one or two of our project teams and conducts a problem 
clinic for our other projects.


Making quality visible 


33 Ideally, we wish all of our analytical products to be accompanied by a quality 
declaration and a suite of quality indicators. This would encourage intelligent use. So 
far, we have relied largely on making our analyses transparent — we explain clearly 
the input datasets that we have used, the transformations we have applied, and the 
assumptions on which our data treatments and transformations have relied. Where 
possible, we make both the datasets and algorithms available to users, but this is 
sometimes constrained by our need to protect confidentiality. 


34 Some of our recent products have been accompanied by more thorough 
quality declarations. For example : 


• The papers on the experimental indexes of socioeconomic status for
Indigenous areas include long discussions of the attributes of the input 
data and an extensive analysis of sensitivity of the indexes to our 
methodological choices (such as choice of index components, unit of 
analysis, level of geography, and so on). 


• The papers on household wealth include long discussions of the attributes
of the input data and indicators of quality for each asset and liability class 
(based on a mix of hard data and judgments by ABS subject matter 
experts). 


